 spectral concentration


Noise Stability of Transformer Models

Haris, Themistoklis, Zhang, Zihan, Yoshida, Yuichi

arXiv.org Machine Learning

Understanding simplicity biases in deep learning offers a promising path toward developing reliable AI. A common metric for this, inspired by Boolean function analysis, is average sensitivity, which captures a model's robustness to single-token perturbations. We argue that average sensitivity has two key limitations: it lacks a natural generalization to real-valued domains and fails to explain the "junta-like" input dependence we empirically observe in modern LLMs. To address these limitations, we propose noise stability as a more comprehensive simplicity metric. Noise stability expresses a model's robustness to correlated noise applied to all input coordinates simultaneously. We provide a theoretical analysis of noise stability for single-layer attention and ReLU MLP layers and tackle the multi-layer propagation problem with a covariance interval propagation approach. Building on this theory, we develop a practical noise stability regularization method. Experiments on algorithmic and next-token-prediction tasks show that our regularizer consistently catalyzes grokking and accelerates training by approximately 35% and 75%, respectively. Simplicity biases have been a promising direction of study in recent years (Shah et al., 2020; Vasudeva et al., 2024; Bhattamishra et al., 2022) as they provide a unifying framework for generalization, interpretability and robustness. Neural networks, including Large Language Models (LLMs), often converge to the simplest possible functions that explain the training data.
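To make the notion of noise stability concrete, the following is a minimal sketch of the classical Boolean-cube definition from the analysis of Boolean functions, Stab_rho(f) = E[f(x) f(y)], where x is uniform on {-1, 1}^n and each y_i equals x_i with probability (1 + rho)/2. It is not the paper's real-valued formulation or its regularizer; the function names (`noise_stability`, `dictator`, `parity`, `majority`) are illustrative assumptions, and the computation is exact by enumeration, so it is only practical for small n.

```python
from itertools import product


def noise_stability(f, n, rho):
    """Exact Stab_rho(f) = E[f(x) f(y)] on the Boolean cube {-1, 1}^n.

    x is uniform; y is rho-correlated with x, meaning every coordinate
    is perturbed independently (all coordinates at once, not just one).
    """
    keep = (1 + rho) / 2  # probability that y_i equals x_i
    flip = (1 - rho) / 2  # probability that y_i equals -x_i
    total = 0.0
    for x in product((-1, 1), repeat=n):
        for y in product((-1, 1), repeat=n):
            prob = 1.0
            for xi, yi in zip(x, y):
                prob *= keep if xi == yi else flip
            total += prob * f(x) * f(y)
    return total / 2 ** n  # average over the uniform choice of x


# Illustrative test functions (assumed for this sketch).
def dictator(x):
    return x[0]


def parity(x):
    out = 1
    for xi in x:
        out *= xi
    return out


def majority(x):
    return 1 if sum(x) > 0 else -1


if __name__ == "__main__":
    for name, f in [("dictator", dictator), ("majority", majority), ("parity", parity)]:
        print(name, noise_stability(f, 3, 0.5))
```

In this toy setting, a function that depends on a single coordinate (a 1-junta) is more noise-stable than parity, which depends on every coordinate equally; this is the sense in which noise stability can serve as a simplicity measure.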


Spectral Concentration at the Edge of Stability: Information Geometry of Kernel Associative Memory

Tamamori, Akira

arXiv.org Machine Learning

Recent advances using Kernel Logistic Regression (KLR) have demonstrated that learning can sculpt these landscapes to achieve capacities far exceeding classical limits [1-3]. Our previous phenomenological analysis identified a Ridge of Optimization where stability is maximized via a mechanism we termed Spectral Concentration, defined as a state where the weight spectrum exhibits a sharp hierarchy [4]. However, a deeper question remains: Why do the learning dynamics self-organize into this specific spectral state? Why does the system operate at the brink of instability? To answer these questions, we must look beyond the Euclidean geometry of the weight parameters and consider the intrinsic geometry of the probability distributions they represent. This is the domain of Information Geometry [5]. In this work, we reinterpret the KLR Hopfield network as a statistical manifold equipped with a Fisher-Rao metric.

  curvature, spectral concentration, stability
2511.23083
  Country: Asia > Japan (0.04)
  Genre: Research Report > New Finding (0.89)

Adaptive Parameter Optimization for Robust Remote Photoplethysmography

Morales, Cecilia G., Teh, Fanurs Chi En, Li, Kai, Agrawal, Pushpak, Dubrawski, Artur

arXiv.org Artificial Intelligence

Remote photoplethysmography (rPPG) enables contactless vital sign monitoring using standard RGB cameras. However, existing methods rely on fixed parameters optimized for particular lighting conditions and camera setups, limiting adaptability to diverse deployment environments. This paper introduces the Projection-based Robust Signal Mixing (PRISM) algorithm, a training-free method that jointly optimizes photometric detrending and color mixing through online parameter adaptation based on signal quality assessment. PRISM achieves state-of-the-art performance among unsupervised methods, with MAE of 0.77 bpm on PURE and 0.66 bpm on UBFC-rPPG, and accuracy of 97.3% and 97.5%, respectively, at a 5 bpm threshold. Statistical analysis confirms PRISM performs equivalently to leading supervised methods (p > 0.2), while maintaining real-time CPU performance without training. This validates that adaptive time series optimization significantly improves rPPG across diverse conditions.
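The two metrics quoted above can be illustrated with a short sketch. It assumes that "accuracy at a 5 bpm threshold" means the fraction of heart-rate estimates whose absolute error is at most 5 bpm; the abstract does not spell this out, and the function names and toy values below are hypothetical, not data from the paper.

```python
def mean_absolute_error(estimates, references):
    """Mean absolute error in bpm between estimated and reference heart rates."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(references)


def accuracy_within(estimates, references, threshold_bpm=5.0):
    """Fraction of estimates whose absolute error is at most threshold_bpm.

    Assumed definition of threshold accuracy; the source does not define it.
    """
    hits = sum(abs(e - r) <= threshold_bpm for e, r in zip(estimates, references))
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical heart-rate values in bpm, for illustration only.
    estimated = [61.0, 70.0, 90.0]
    reference = [60.0, 72.0, 80.0]
    print(mean_absolute_error(estimated, reference))
    print(accuracy_within(estimated, reference))
```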


Handling Missing Data via Max-Entropy Regularized Graph Autoencoder

Gao, Ziqi, Niu, Yifan, Cheng, Jiashun, Tang, Jianheng, Xu, Tingyang, Zhao, Peilin, Li, Lanqing, Tsung, Fugee, Li, Jia

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) are popular tools for modeling relational data. Existing GNNs are not designed for attribute-incomplete graphs, making missing attribute imputation a pressing issue. Recently, many works have noted that GNNs are coupled with spectral concentration, which means the spectrum obtained by GNNs concentrates on a local part of the spectral domain, e.g., on low frequencies due to the oversmoothing issue. As a consequence, GNNs may be seriously flawed for reconstructing graph attributes, as graph spectral concentration tends to cause low imputation precision. In this work, we present a regularized graph autoencoder for graph attribute imputation, named MEGAE, which aims at mitigating the spectral concentration problem by maximizing the graph spectral entropy. Notably, we first present a method for estimating graph spectral entropy without eigendecomposition of the Laplacian matrix and provide a theoretical upper bound on the error. A maximum entropy regularization then acts in the latent space, which directly increases the graph spectral entropy. Extensive experiments show that MEGAE outperforms all the other state-of-the-art imputation methods on a variety of benchmark datasets.
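As a rough illustration of what spectral concentration means for a graph signal, the sketch below computes one plausible form of graph spectral entropy: the Shannon entropy of a signal's normalized energy over the eigenvectors of the combinatorial Laplacian. This is an assumed definition for illustration, and it uses an explicit eigendecomposition, which the paper's estimator avoids; the function name `graph_spectral_entropy` and the example graph are hypothetical.

```python
import numpy as np


def graph_spectral_entropy(adjacency, signal):
    """Shannon entropy of a signal's energy distribution over Laplacian eigenvectors.

    Low entropy means the energy is concentrated on a few graph frequencies.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian D - A
    _, U = np.linalg.eigh(L)                # columns of U form the graph Fourier basis
    coeffs = U.T @ np.asarray(signal, dtype=float)
    energy = coeffs ** 2
    p = energy / energy.sum()               # normalized spectral energy
    p = p[p > 0]                            # drop exact zeros before taking logs
    return float(-(p * np.log(p)).sum())


if __name__ == "__main__":
    # Path graph on four nodes (illustrative example).
    path = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
    smooth = graph_spectral_entropy(path, [1, 1, 1, 1])  # constant signal
    spiky = graph_spectral_entropy(path, [1, 0, 0, 0])   # signal on one node
    print(smooth, spiky)
```

A constant signal, the extreme case of oversmoothing, places essentially all of its energy on the zero-frequency eigenvector and so has near-zero entropy, while a signal localized on one node spreads across all frequencies.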